8 research outputs found

    Cross-Platform Question Answering in Social Networking Services

    The last two decades have made the Internet a major source for knowledge seeking. Several platforms have been developed to find answers to one's questions, such as search engines and online encyclopedias. The wide adoption of social networking services has pushed the possibilities even further by giving people the opportunity to stimulate the generation of answers that are not already present on the Internet. Some of these social media services are primarily community question answering (CQA) sites, while others have a more general audience but can also be used to ask and answer questions. The choice of a particular platform (e.g., a CQA site, a microblogging service, or a search engine) by a user depends on several factors, such as awareness of available resources and expectations from different platforms, and thus will sometimes be suboptimal. Hence, we introduce cross-platform question answering, a framework that aims to improve our ability to satisfy complex information needs by returning answers from different platforms, including those where the question was not originally asked. We propose to build this core capability by defining a general architecture for designing and implementing real-time services for answering naturally occurring questions. This architecture consists of four key components: (1) real-time detection of questions; (2) a set of platforms from which answers can be returned; (3) question processing by the selected answering systems, which optionally involves question transformation when questions are answered by services whose conventions differ from those of the original source; and (4) answer presentation, including ranking, merging, and deciding whether to return the answer. We demonstrate the feasibility of this general architecture by instantiating a restricted development version in which we collect questions from one CQA website, one microblogging service, or directly from the asker, and find answers from a subset of those CQA and microblogging services. To enable the integration of new answering platforms in our architecture, we introduce a framework for the automatic evaluation of their effectiveness.
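    As one way to make the four-component architecture concrete, it can be sketched as a simple routing pipeline. The interfaces, scoring function, and threshold below are hypothetical illustrations, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    source: str  # platform where the question was detected (component 1)

class AnsweringPlatform:
    """Hypothetical interface for one answering platform (component 2)."""
    name = "generic"

    def transform(self, q: Question) -> str:
        # Component 3: adapt the question to this platform's conventions,
        # e.g. shorten it for a microblogging service.
        return q.text

    def answer(self, query: str) -> list[str]:
        return []  # a real platform would query its service here

def overlap_score(answer: str, q: Question) -> float:
    """Toy relevance score: word overlap between question and answer."""
    qw, aw = set(q.text.lower().split()), set(answer.lower().split())
    return len(qw & aw) / max(len(qw), 1)

def answer_question(q: Question, platforms: list[AnsweringPlatform],
                    min_score: float = 0.5) -> list[str]:
    """Components 2-4: route the question to every platform, then merge,
    rank, and decide whether each candidate answer is worth returning."""
    candidates = [(overlap_score(a, q), a)
                  for p in platforms
                  for a in p.answer(p.transform(q))]
    return [a for s, a in sorted(candidates, reverse=True) if s >= min_score]
```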

    Journalists and Twitter: A Multidimensional Quantitative Description of Usage Patterns

    We conduct a large-scale quantitative comparison of the usage patterns of a microblogging service by journalists, news organizations, and news consumers. Through two statistical tests of eighteen numerical features over 5,000 news producers and 1 million news consumers, we find that Arab journalists and English news organizations tend to broadcast their tweets to a large audience; that English journalists adopt a strategy of targeted and engaging communication; that journalists are more distinguishable in the Arab world than in the English-speaking European countries; that print and radio journalists behave very differently, while television journalists share some characteristics with each of them; and that British and Irish journalists are similar to a large extent. This paper is the first to provide a multidimensional bird's-eye view of the usage patterns of journalists on Twitter.
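    The comparison rests on two-sample significance tests over per-account numerical features. As a hedged illustration (the paper does not name its exact tests, and the feature values below are synthetic placeholders), one such test might look like:

```python
# Illustrative two-sample test on one numerical feature; the groups and
# values are placeholders, not the paper's data.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
journalists = rng.lognormal(mean=1.0, sigma=0.5, size=500)   # placeholder feature values
consumers = rng.lognormal(mean=0.6, sigma=0.5, size=5000)    # placeholder feature values

# Nonparametric test of whether the two groups differ on this feature.
stat, p = mannwhitneyu(journalists, consumers, alternative="two-sided")
print(f"U = {stat:.0f}, p = {p:.3g}")
```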

    What Questions Do Journalists Ask on Twitter?

    Social media platforms are a major source of information for both the general public and journalists. Journalists use Twitter and other social media services to gather story ideas, to find eyewitnesses, and for a wide range of other purposes. One way in which journalists use Twitter is to ask questions. This paper reports on an empirical investigation of questions asked by Arab journalists on Twitter. The analysis begins with the development of an ontology of question types, proceeds to human annotation of training and test data, and concludes by reporting the level of accuracy that can be achieved with automated classification techniques. The results show good classifier effectiveness for high-prevalence question types, but obtaining sufficient training data for lower-prevalence question types can be challenging.
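    As an illustration of the classification step, one plausible baseline is a TF-IDF representation with a linear classifier; the paper does not specify this model, and the question types and example tweets below are invented placeholders:

```python
# Sketch of a question-type classifier over annotated tweets.
# Labels and training examples are invented; the paper's ontology differs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Anyone near the stadium who saw what happened?",  # eyewitness
    "What do you think of the new budget?",            # opinion
    "When does the press conference start?",           # factual
    "Can anyone confirm the minister's statement?",    # verification
]
train_labels = ["eyewitness", "opinion", "factual", "verification"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)
print(clf.predict(["Who saw the explosion downtown?"]))
```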

    DavidDLewis.com

    It is common to develop and validate classifiers through a process of repeated testing, with nested training and/or test sets of increasing size. We demonstrate in this paper that such repeated testing leads to biased estimates of classifier effectiveness. Experiments on a range of text classification tasks under three sequential testing frameworks show that all three lead to optimistic estimates of effectiveness. We calculate empirical adjustments that remove this bias on our data set, and identify directions for research that could lead to general techniques for avoiding bias while reducing labeling costs.
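    The optimistic bias can be illustrated with a small simulation: if a classifier with fixed true accuracy is scored on a sequence of test sets and the best observed score is the one acted on, that score overstates the truth on average. This toy setup is only an illustration, not the paper's experimental design:

```python
# Simulate repeated testing: a classifier with fixed true accuracy is
# evaluated on a sequence of test sets, and the best observed score is
# kept. Reporting that maximum overstates the true accuracy on average.
import numpy as np

rng = np.random.default_rng(1)
true_acc, n_tests, test_size, n_trials = 0.80, 10, 100, 2000

best_scores = []
for _ in range(n_trials):
    scores = rng.binomial(test_size, true_acc, size=n_tests) / test_size
    best_scores.append(scores.max())  # select-then-report, as in repeated testing

print(f"true accuracy: {true_acc:.3f}")
print(f"mean best-of-{n_tests} estimate: {np.mean(best_scores):.3f}")  # exceeds 0.800
```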

    Blogs as a Collective War Diary

    Disaster-related research in human-centered computing has typically focused on the shorter-term, emergency period of a disaster event, whereas the effects of some crises are long-term, lasting years. Social media archived on the Internet provides researchers the opportunity to examine societal reactions to a disaster over time. In this paper we examine how blogs written during a protracted conflict might reflect a collective view of the event. The sheer amount of data originating from the Internet about a significant event poses a challenge to researchers; we employ topic modeling and pronoun analysis as methods for analyzing such large-scale data. First, we discovered that blog war topics temporally tracked the actual, measurable violence in the society, suggesting that blog content can be an indicator of the health or state of the affected population. We also found that people exhibited a collective identity when they blogged about war, as evidenced by a higher use of first-person plural pronouns compared to blogging on other topics. Blogging about daily life decreased as violence in the society increased; when violence waned, there was a resurgence of daily-life topics, potentially illustrating how a society returns to normalcy.
    Keywords: blogs, collective identity, crisis, war, crisis informatics
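    Both analysis methods are standard and easy to sketch; the snippet below pairs LDA topic modeling (one plausible implementation choice, not necessarily the paper's) with a first-person plural pronoun ratio over invented example posts:

```python
# Sketch of the two analysis methods: LDA topic modeling over blog posts
# and a first-person-plural pronoun ratio. Example posts are invented.
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "we heard shelling near our street again and we stayed inside",
    "I cooked dinner and watched a film with my sister",
]

# Topic modeling: fit LDA on a bag-of-words representation.
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(posts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Pronoun analysis: share of first-person plural pronouns per post.
FPP = {"we", "us", "our", "ours", "ourselves"}
for post in posts:
    tokens = re.findall(r"[a-z']+", post.lower())
    ratio = sum(t in FPP for t in tokens) / len(tokens)
    print(f"{ratio:.2f}  {post[:40]}")
```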

    Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries
